Cross-Lingual Link Discovery between Chinese and English Wiki Knowledge Bases
Abstract
Wikipedia is an online multilingual encyclopedia that contains a very large number of articles covering most written languages. However, one critical issue for Wikipedia is that pages in different languages are rarely linked to each other, except for the cross-lingual links between pages about the same subject. This can pose serious difficulties for humans and machines that seek information from sources in different languages. To address this issue, we propose a hybrid approach that exploits anchor strength, topic relevance and an entity knowledge graph to automatically discover cross-lingual links. In addition, we develop CELD, a system for automatically linking key terms in Chinese documents with English concepts. As demonstrated in the experimental evaluation, the proposed model outperforms several baselines on the NTCIR data set, which was designed specifically for evaluating cross-lingual link discovery.
Similar resources
UKP at CrossLink: Anchor Text Translation for Cross-lingual Link Discovery
This paper describes UKP’s participation in the cross-lingual link discovery (CLLD) task at NTCIR-9. The given task is to find valid anchor texts from a new English Wikipedia page and retrieve the corresponding target Wiki pages in Chinese, Japanese, and Korean languages. We have developed a CLLD framework consisting of anchor selection, anchor ranking, anchor translation, and target discovery ...
Cross-Lingual Knowledge Discovery: Chinese-to-English Article Linking in Wikipedia
In this paper we examine automated Chinese-to-English link discovery in Wikipedia and the effects of Chinese segmentation and Chinese-to-English translation on hyperlink recommendation. Our experimental results show that the implemented link discovery framework can effectively recommend Chinese-to-English cross-lingual links. The techniques described here can assist bi-lingual users where a ...
Simple Yet Effective Methods for Cross-Lingual Link Discovery (CLLD) - KMI @ NTCIR-10 CrossLink-2
Cross-Lingual Link Discovery (CLLD) aims to automatically find links between documents written in different languages. In this paper, we first present relatively simple yet effective methods for CLLD in Wiki collections, explaining the findings that motivated their design. Our methods (team KMI) achieved in the NTCIR-10 CrossLink-2 evaluation the best overall results in the English to Chinese...
Overview of the NTCIR-9 Crosslink Task: Cross-lingual Link Discovery
This paper presents an overview of the NTCIR-9 Cross-lingual Link Discovery (Crosslink) task. The overview includes: the motivation of cross-lingual link discovery; the Crosslink task definition; the run submission specification; the assessment and evaluation framework; the evaluation metrics; and the evaluation results of submitted runs. Cross-lingual link discovery (CLLD) is a way of automaticall...
Multi-filtering Method Based Cross-lingual Link Discovery
This paper describes the cross-lingual link discovery method of ISTIC used in the system evaluation task at NTCIR-9. In this year's evaluation, we participated in the English-to-Chinese cross-lingual link discovery task. In this paper, we mainly describe our understanding of CLLD, the key techniques of our system, and the evaluation results.
Publication date: 2013